This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Improve quantization flow #15961

Merged
merged 15 commits into from
Aug 29, 2019

Conversation

ZhennanQin
Contributor

Description

@pengzhao-intel @TaoLv @xinyu-intel @reminisce @KellenSunderland @anirudh2290
Major changes:

  • calib_layer is no longer needed. The quantization pass now generates the list of layers that need calibration, so users don't have to specify it. Only the layers that need it are calibrated, which improves calibration speed.

  • Entropy calibration is refactored. Only the histogram of each output is saved, which reduces memory consumption. The entropy calculation is now implemented as a C++ operator and its accuracy is improved, so the entropy method can reach the same speed as the naive method.

| Model | FP32 Accuracy | INT8 Naïve | INT8 entropy_old | INT8 entropy_new |
|---|---|---|---|---|
| ResNet50-V1 | 76.340% | 76.060% | 76.006% | 76.053% |
| Squeezenet 1.0 | 56.980% | 56.790% | 55.584% | 56.997% |
| MobileNet 1.0 | 72.230% | 72.060% | 71.822% | 71.822% |
| MobileNetV2 1.0 | 70.270% | 69.820% | 69.950% | 70.016% |
| Inception V3 | 77.760% | 78.050% | 78.019% | 78.053% |
| Inception-BN | 72.280% | 72.020% | 72.084% | 71.978% |
| mean_for_all | 70.977% | 70.800% | 70.578% | 70.820% |

  • Add a new quantization mode, smart, which automatically decides whether each op should be quantized. This mode only quantizes nodes that bring a performance benefit (e.g. convolution and FC), along with any other necessary nodes. For example, let A be a convolution or FC node, which is always quantized; B a ReLU or Add node, which is quantizable, so the quantization flow decides whether to quantize it; and C a non-quantized node. For A -> B -> A, B is quantized because it can pass int8 data through. For C -> B -> C, A -> B -> C, or C -> B -> A, B is not quantized.

  • Add logging to the quantization flow, which helps users understand what the flow does and what it changed.
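
To make the entropy bullet above more concrete, here is a rough Python sketch of the general idea: keep only a running histogram per output, then pick a clipping threshold by minimizing KL divergence between the clipped histogram and its coarse quantized version. This is a textbook-style illustration, not the C++ operator added in this PR. The names `accumulate`, `best_threshold` and `num_levels`, the 8-bin default and the bin-padding strategy are all assumptions made for the example.

```python
import math


def accumulate(hist, threshold, batch, num_bins=8):
    """Fold one batch into a histogram covering [-threshold, threshold].

    Pass hist=None for the first batch. The bin width is fixed by the first
    batch; a later batch with larger values pads bins on both sides, so the
    raw outputs never need to be kept. num_bins is assumed to be even.
    """
    batch_max = max(abs(v) for v in batch)
    if hist is None:
        hist, threshold = [0] * num_bins, batch_max or 1.0
    width = 2 * threshold / len(hist)
    if batch_max > threshold:
        extra = math.ceil((batch_max - threshold) / width)
        hist = [0] * extra + hist + [0] * extra
        threshold += extra * width
    for v in batch:
        idx = int((v + threshold) / width)
        hist[min(idx, len(hist) - 1)] += 1  # a value equal to +threshold lands in the last bin
    return hist, threshold


def kl_divergence(p, q):
    """KL(p || q) for two unnormalized histograms of equal length."""
    p_sum, q_sum = sum(p), sum(q)
    if q_sum == 0:
        return math.inf
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += (pi / p_sum) * math.log((pi / p_sum) / (qi / q_sum))
    return total


def quantize_and_expand(sliced, mask, num_levels):
    """Merge bins into num_levels groups, then spread each group's mass
    evenly over the bins of that group that are occupied in the reference."""
    n = len(sliced)
    q = [0.0] * n
    for level in range(num_levels):
        start = level * n // num_levels
        stop = (level + 1) * n // num_levels
        occupied = [j for j in range(start, stop) if mask[j]]
        if occupied:
            share = sum(sliced[start:stop]) / len(occupied)
            for j in occupied:
                q[j] = share
    return q


def best_threshold(hist, threshold, num_levels=4):
    """Search symmetric clipping thresholds and return the one with the
    smallest KL divergence. hist must have an even number of bins."""
    n = len(hist)
    half = n // 2
    width = 2 * threshold / n
    best_div, best_th = math.inf, threshold
    for keep in range(num_levels // 2, half + 1):  # bins kept on each side of zero
        lo, hi = half - keep, half + keep
        sliced = hist[lo:hi]
        p = list(sliced)
        p[0] += sum(hist[:lo])   # clipped outliers saturate into the edge bins
        p[-1] += sum(hist[hi:])
        q = quantize_and_expand(sliced, [c > 0 for c in p], num_levels)
        div = kl_divergence(p, q)
        if div < best_div:
            best_div, best_th = div, keep * width
    return best_th
```

In this sketch, calling `accumulate` once per calibration batch and then `best_threshold` on the result is enough to get a threshold, and memory use depends on the number of bins rather than on the number of calibration samples. A real implementation would use far more bins and quantization levels than the tiny defaults shown here.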

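The A/B/C rule described for the smart mode can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the description, restricted to a linear chain of nodes; the function name `smart_quantize` and the single-letter node kinds are hypothetical, and cases the description does not cover (for example several B nodes in a row, or branching graphs) are simply treated as "not quantized" here.

```python
def smart_quantize(chain):
    """Decide which nodes of a linear chain get quantized.

    chain is a list of node kinds in execution order:
      "A": always quantized (e.g. convolution, FC)
      "B": quantizable, decided by its neighbours (e.g. ReLU, Add)
      "C": never quantized
    Returns a list of booleans, one per node.
    """
    decisions = []
    for i, kind in enumerate(chain):
        if kind == "A":
            decisions.append(True)
        elif kind == "B":
            prev_kind = chain[i - 1] if i > 0 else None
            next_kind = chain[i + 1] if i + 1 < len(chain) else None
            # Assumption: B is quantized only when it sits directly between
            # two A nodes, so int8 data can flow through it unchanged.
            decisions.append(prev_kind == "A" and next_kind == "A")
        else:
            decisions.append(False)
    return decisions
```

With this rule, `smart_quantize(["A", "B", "A"])` marks all three nodes as quantized, while in `["A", "B", "C"]` only the leading A is quantized, matching the four cases listed in the description.
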
Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Change-Id: Id9d7504890852faf4e84fdcd66585d1fd78beeb2
Change-Id: I4c82f64dbef501d2560f7ffee93991119a66a5ee
Change-Id: I068df3d4f3309bc9b950a3b869da9407282c8577
@pengzhao-intel
Contributor

@KellenSunderland the requests from your team :)

@ZhennanQin ZhennanQin force-pushed the smart_quantize_fast branch 2 times, most recently from ad36eb0 to 98261b4 Compare August 22, 2019 00:51
Change-Id: Ia38369d31c33d0f76a671275910729dfce693950
@pengzhao-intel
Contributor

@ZhennanQin @xinyu-intel please rebase the code and retrigger the CI

xinyu-intel and others added 8 commits August 26, 2019 21:17
Change-Id: I7479327db5ebc7c57b7bd810a67d2b765c820534
Change-Id: I4273938cb972c12b8f43dbd95c736a7d32df040e
Change-Id: I80b47bd1d95520a7cd78cacbbc1a85fe0900123d
Change-Id: I56542470010e7bc403f62dc8a8991c2fb58d229e
Change-Id: If8482fe4da2f3d627dd3cbac8795e021a09a441f
Change-Id: Ic239cbf7aa3d111f2895badd1cac196fce6a1b86
Contributor

@pengzhao-intel pengzhao-intel left a comment

It's not easy to pass the CI.

Merging now for the customer request.

@pengzhao-intel pengzhao-intel merged commit 3f7b6ee into apache:master Aug 29, 2019
@ZhennanQin ZhennanQin deleted the smart_quantize_fast branch September 16, 2019 07:00
3 participants